Improving Access to Digitized Historical Newspapers with Text Mining, Coordinated Models, and Formative User Interface Design

Author

  • Robert B. Allen
Abstract

Most tools for accessing digitized historical newspapers emphasize relatively simple search; but, as increasing numbers of digitized historical newspapers and other historical resources become available, we can consider much richer modes of interaction with these collections. For instance, users might use exploratory search for looking at larger issues and events such as elections and campaigns or to get a sense of “the texture of the city... how the city was thinking.” To take full advantage of rich interface tools, the content of the newspapers needs to be described systematically and accurately. Moreover, collections of multiple newspapers need to be richly cross-indexed across titles and even with historical resources beyond the newspapers.

Similar resources

Full-Text Access to Historical Newspapers Tapas Kanungo and

Newspapers are rich records of U.S. history. Due to the deterioration of older newspapers, the National Endowment for the Humanities is archiving 19th century newspapers on microfilm. Although microfilm is a good preservation method, it provides limited access to researchers and the general public. We are building a system to provide universal access to digital images and full-text content of h...

A Framework for Text Processing and Supporting Access to Collections of Digitized Historical Newspapers

Large quantities of historical newspapers are being digitized and OCRd. We describe a framework for processing the OCRd text to identify articles and extract metadata for them. We describe the article schema and provide examples of features that facilitate automatic indexing of them. For this processing, we employ lexical semantics, structural models, and community content. Furthermore, we desc...

Automated Processing of Digitized Historical Newspapers beyond the Article Level: Sections and Regular Features

Millions of pages of historical newspapers have been digitized, but in most cases access to them is supported by only basic search services. We are exploring interactive services for these collections which would be useful for supporting access, including automatic categorization of articles. Such categorization is difficult because of the uneven quality of the OCR text, but there are many clu...

The leveled approach. Using and evaluating text mining tools AVResearcherXL and Texcavator for historical research on public perceptions of drugs

In our research on public perceptions of drugs in Dutch newspapers we have developed a leveled explorative historical research approach. We employ digital tools as “signposts” that indicate existing debates in newspapers that can be interpreted historically using a hermeneutic approach. Conceptualizing the ways we use text-mining tools as historians helps to align user needs with technological ...

Methods for User-Centered Design and Evaluation of Text Analysis Tools in a Digital History Project

This paper reports on the user centered, formative evaluation of tools and the validation of models for the analysis of historical textbooks in the context of the digital history project Children and their World. The goal of the project is to create methods for computer-supported, interactive analysis that can be applied to a large corpus of historical textbooks on history and geography (~5000 ...

Journal title:
  • CoRR

Volume abs/1502.03943  Issue 

Pages  -

Publication date 2010